Abstract — Cooperative spectrum sensing has been shown to
greatly  improve  the  sensing  performance  in  cognitive  radio 
networks.  However,  if  the  cognitive  users  belong  to  different 
service  providers,  they  tend  to  contribute  less  in  sensing  in  order 
to  achieve  a  higher  throughput.  In  this  paper,  we  propose  an 
evolutionary  game  framework  to  study  the  interactions  between 
selfish  users  in  cooperative  sensing.  We  derive  the  behavior 
dynamics  and  the  stationary  strategy  of  the  secondary  users, 
and  further  propose  a  distributed  learning  algorithm  that  helps 
the  secondary  users  approach  the  Nash  equilibrium  with  only 
local  payoff  observation.  Simulation  results  show  that  the  average 
throughput  achieved  in  the  cooperative  sensing  game  with  more 
than  two  secondary  users  is  higher  than  that  when  the  secondary 
users  sense  the  primary  user  individually  without  cooperation. 

I.  Introduction 

With the emergence of new wireless applications and devices,
the last decade has witnessed a dramatic increase in the demand
for radio spectrum, which has forced government regulatory
bodies, such as the Federal Communications Commission (FCC),
to review their policies. Since the frequency bands allocated to
some licensed spectrum holders experience very low utilization
[1], the FCC has been considering opening the under-utilized
licensed bands to secondary users on an opportunistic basis with
the aid of cognitive radio technology [2].

In order to protect the primary users from interference due
to secondary users’ operation, spectrum sensing has become
an essential function of cognitive radio devices [3]. Recently,
cooperative spectrum sensing, with the help of relay nodes or
through multiuser collaboration, has been shown to greatly
improve the sensing performance [4]-[10]. In [4], the authors
proposed collaborative spectrum sensing to combat
shadowing/fading effects. The authors of [5] proposed
light-weight cooperation based on hard decisions to reduce the
sensitivity requirements. The authors of [6] showed that
cooperation in sensing can reduce the detection time of the
primary user and increase the overall agility. How to choose the
secondary users for cooperation was investigated in [7]. The
authors of [8] studied the design of the sensing slot duration to
maximize the secondary throughput. Two energy-based cooperative
detection methods using weighted combining were analyzed in [9].
Spatial diversity in multiuser networks was exploited in [10] to
improve the spectrum sensing capabilities of centralized
cognitive radio networks.

In most of the existing cooperative spectrum sensing schemes
[4]-[10], it is generally assumed that all secondary users belong
to the same authority. They voluntarily forward their sensing
outcomes to a centralized controller (e.g., the secondary base
station), which makes a final decision on whether the primary
user is present or not. However, with the emerging applications
of mobile ad hoc networks envisioned in civilian usage, the
secondary users may be selfish and may not serve a common goal.
Sensing a licensed frequency band also consumes a certain amount
of energy and time that could otherwise be devoted to data
transmission. If multiple secondary users occupy different
sub-bands of one primary user and can overhear the other users’
sensing outcomes, they tend to take advantage of the others and
wait for them to sense the primary user, so as to reserve more
time for their own data transmission.

In order to study the interactions between the selfish users
and their stationary strategy in the long run, in this paper
we propose to model cooperative spectrum sensing as an
evolutionary game. If some secondary users agree to cooperate
in sensing, the cost can be equally shared among them, while
the users who do not take part in cooperative sensing can
enjoy a free ride. However, if no user senses the primary user,
then all of them will be punished by a very low payoff. By
using replicator dynamics [14], we obtain the equations that
govern the users’ behavior dynamics, and further derive the
equilibrium strategy when all secondary users are assumed
homogeneous in their individual data rates and in the received
SNRs of the primary user (e.g., when the secondary users are
clustered together and located far away from the primary base
station). Moreover, we develop a distributed learning algorithm
that can help the secondary users approach their optimal strategy
with only their own payoff history. Simulation results show
that as the number of secondary users and the cost of sensing
increase, the users tend to have less incentive to contribute
to the cooperative sensing. However, they can still achieve a
higher average throughput in the spectrum sensing game than
with single-user sensing, if there are more than two
secondary users in the cognitive radio network.

The  rest  of  this  paper  is  organized  as  follows.  The  system 
model  is  presented  in  Section  II.  In  Section  III,  we  formulate 
the  cooperative  spectrum  sensing  as  an  evolutionary  game, 
analyze  the  behavior  dynamics  of  the  secondary  users,  and 
develop a distributed learning algorithm that approaches
equilibrium. Simulation results are shown in Section IV. Finally,
Section  V  concludes  the  paper. 


II.  System  Model 


A. Hypothesis of Channel Sensing

When a secondary user senses the licensed spectrum channel in a
cognitive radio network, the received signal $r(t)$ follows one
of two hypotheses, $\mathcal{H}_1$ or $\mathcal{H}_0$, depending
on whether the primary user is present or absent, respectively.
Then, $r(t)$ can be written as

$$r(t) = \begin{cases} h s(t) + w(t), & \text{if } \mathcal{H}_1; \\ w(t), & \text{if } \mathcal{H}_0. \end{cases} \tag{1}$$

In (1), $h$ is the gain of the channel from the primary user’s
transmitter to the secondary user’s receiver; $s(t)$ is the signal
of the primary user, which is assumed to be an i.i.d. random
process with mean zero and variance $\sigma_s^2$; and $w(t)$ is
additive white Gaussian noise (AWGN) with mean zero and variance
$\sigma_w^2$. $s(t)$ and $w(t)$ are assumed to be mutually
independent.

If an energy detector is used to sense the licensed spectrum,
the test statistic $T(r)$ is defined as

$$T(r) = \frac{1}{N} \sum_{t=1}^{N} |r(t)|^2, \tag{2}$$

where $N$ is the number of collected samples.

The performance of licensed spectrum sensing is characterized
by two probabilities, the probability of detection, $P_D$,
and the probability of false alarm, $P_F$. If the noise term
$w(t)$ is assumed to be circularly symmetric complex Gaussian
(CSCG), the probability of false alarm $P_F$ is given by [12]

$$P_F(\lambda) = Q\left(\left(\frac{\lambda}{\sigma_w^2} - 1\right)\sqrt{N}\right), \tag{3}$$

where $\lambda$ is the threshold of the energy detector, and
$Q(\cdot)$ denotes the complementary distribution function of the
standard Gaussian. Similarly, if we assume the primary signal is
a complex PSK signal, then the probability of detection $P_D$
can be approximated by [12]

$$P_D(\lambda) = Q\left(\left(\frac{\lambda}{\sigma_w^2} - \gamma - 1\right)\sqrt{\frac{N}{2\gamma + 1}}\right), \tag{4}$$

where $\gamma = |h|^2 \sigma_s^2 / \sigma_w^2$ denotes the
received signal-to-noise ratio (SNR) of the primary user under
$\mathcal{H}_1$.

Given a target detection probability $\bar{P}_D$, the threshold
$\lambda$ can be derived from (4), and the probability of false
alarm $P_F$ can be further rewritten as

$$P_F(\bar{P}_D, N, \gamma) = Q\left(\sqrt{2\gamma + 1}\, Q^{-1}(\bar{P}_D) + \sqrt{N}\,\gamma\right), \tag{5}$$

where $Q^{-1}(\cdot)$ denotes the inverse function of $Q(\cdot)$.
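As a rough numerical illustration of (5), the following Python sketch evaluates the false alarm probability for a few sample counts. The function names are hypothetical, and the values 0.95 and -12 dB are simply borrowed from the simulation settings quoted in Section IV; the standard library's `NormalDist` is used for $Q(\cdot)$ and its inverse.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian: mean 0, standard deviation 1


def q_func(x):
    """Gaussian tail probability Q(x) = 1 - Phi(x)."""
    return 1.0 - STD_NORMAL.cdf(x)


def q_inv(p):
    """Inverse of Q for 0 < p < 1, using Q^{-1}(p) = Phi^{-1}(1 - p)."""
    return STD_NORMAL.inv_cdf(1.0 - p)


def false_alarm_prob(pd_target, n_samples, snr):
    """Eq. (5): false alarm probability for a target P_D, N samples, linear SNR."""
    return q_func(sqrt(2.0 * snr + 1.0) * q_inv(pd_target) + sqrt(n_samples) * snr)


snr = 10.0 ** (-12.0 / 10.0)  # -12 dB converted to a linear ratio
for n_samples in (1000, 2000, 4000):
    print(n_samples, false_alarm_prob(0.95, n_samples, snr))
```

For a fixed target detection probability, the printed values should shrink as the sample count grows, which is the dependence on $N$ that the throughput analysis relies on.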

B.  Throughput  of  a  Secondary  User 

When sensing the primary user’s activity, the secondary
users cannot perform data transmission at the same time. If we
denote the sampling frequency by $f_s$ and the frame duration
by $T$, then the time duration for data transmission is given
by $T - \delta(N)$, where $\delta(N) = N/f_s$ represents the time
spent in sensing. When the primary user is absent and no false
alarm is generated, the average throughput of the secondary user
is

$$R_{\mathcal{H}_0}(N) = \frac{T - \delta(N)}{T} (1 - P_F)\, C_{\mathcal{H}_0}, \tag{6}$$

where $C_{\mathcal{H}_0}$ represents the data rate of the
secondary user under $\mathcal{H}_0$. When the primary user is
present but not detected by the secondary user, the average
throughput of the secondary user is

$$R_{\mathcal{H}_1}(N) = \frac{T - \delta(N)}{T} (1 - P_D)\, C_{\mathcal{H}_1}, \tag{7}$$

where $C_{\mathcal{H}_1}$ represents the data rate of the
secondary user under $\mathcal{H}_1$.

Fig. 1: System model

If we denote by $P_{\mathcal{H}_0}$ the probability that the
primary user is absent, then the total throughput of the
secondary user is

$$R(N) = P_{\mathcal{H}_0} R_{\mathcal{H}_0}(N) + (1 - P_{\mathcal{H}_0}) R_{\mathcal{H}_1}(N). \tag{8}$$

From the secondary user’s perspective, the goal is to maximize
the total throughput (8), subject to $P_D \geq \bar{P}_D$.
As mentioned in [8], in practice the target detection probability
$\bar{P}_D$ is required by the primary user to be close to 1;
moreover, we usually have $P_{\mathcal{H}_0}$ close to 1 and
$C_{\mathcal{H}_1} < C_{\mathcal{H}_0}$ (due to the interference
from the primary user to the secondary user). Therefore, (8) can
be approximated by

$$R(N) \approx P_{\mathcal{H}_0} R_{\mathcal{H}_0}(N) = P_{\mathcal{H}_0} \frac{T - \delta(N)}{T} (1 - P_F)\, C_{\mathcal{H}_0}. \tag{9}$$

Equation (9) shows a tradeoff in the choice of $N$: a larger $N$
lowers $P_F$ but leaves less time for data transmission, so there
is an optimal $N$ that maximizes the throughput $R(N)$. In order
to keep a low $P_F$ with a smaller $N$, a good choice is
cooperative spectrum sensing with the other secondary users
in the same licensed band.
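The tradeoff in (9) can be explored numerically. The sketch below is only an illustration under assumed settings: it plugs (5) into (9) and grid-searches the sample count $N$ for a single user, using the parameter values quoted in Section IV (frame of 5 ms, 4 MHz sampling, target detection probability 0.95, SNR of -12 dB, primary user absent with probability 0.9) and a data rate normalized to 1. The helper names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def false_alarm_prob(pd_target, n_samples, snr):
    """Eq. (5), with Q(x) = 1 - Phi(x) and Q^{-1}(p) = Phi^{-1}(1 - p)."""
    arg = sqrt(2.0 * snr + 1.0) * STD_NORMAL.inv_cdf(1.0 - pd_target)
    arg += sqrt(n_samples) * snr
    return 1.0 - STD_NORMAL.cdf(arg)


def approx_throughput(n_samples, frame_s, fs_hz, p_h0, rate_h0, pd_target, snr):
    """Eq. (9): P_H0 * (T - delta(N)) / T * (1 - P_F) * C_H0."""
    sensing_s = n_samples / fs_hz  # delta(N) = N / f_s
    p_f = false_alarm_prob(pd_target, n_samples, snr)
    return p_h0 * (frame_s - sensing_s) / frame_s * (1.0 - p_f) * rate_h0


def best_sample_count(frame_s, fs_hz, step, **link):
    """Grid search over N in steps of `step` samples (hypothetical helper)."""
    max_n = round(frame_s * fs_hz)  # sensing cannot last longer than the frame
    return max(range(step, max_n, step),
               key=lambda n: approx_throughput(n, frame_s, fs_hz, **link))


params = dict(p_h0=0.9, rate_h0=1.0, pd_target=0.95, snr=10.0 ** -1.2)
n_best = best_sample_count(frame_s=5e-3, fs_hz=4e6, step=100, **params)
print("best N:", n_best, "sensing fraction:", n_best / (5e-3 * 4e6))
```

A grid search is used here only because it is easy to check by eye; any one-dimensional optimizer would serve equally well.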

III.  Spectrum  Sensing  Game 
A.  Problem  Formulation 

A snapshot of a cognitive radio network is shown in Fig. 1,
where the secondary users are clustered together but far away
from the primary base station. The cooperative spectrum
sensing is shown in Fig. 2. We assume that the entire licensed
band is divided into $K$ sub-bands, and each secondary user
operates exclusively in one of the $K$ sub-bands when the
primary user is absent. The transmission time is slotted into
intervals of length $T$. Before each data transmission, the
secondary users need to sense the primary user’s activity. The
secondary users can jointly sense the primary user’s presence
and exchange their sensing results via a narrow-band signalling
channel, as shown in Fig. 2. In this way, each of them can
spend less time detecting while enjoying a low false alarm
probability $P_F$ via some decision fusion rule [11], and the
spectrum sensing cost (the $N$ samples, or equivalently the
sensing time $\delta(N)$) can be shared by whoever is willing to
contribute (C).

Fig. 2: Cooperative spectrum sensing

However, the secondary users may not serve a common authority,
and they will act selfishly by pursuing as high a throughput as
possible. Once a secondary user is able to overhear the detection
results from the other users, he/she tends to take advantage of
that by declining (D) to take part in spectrum sensing. In this
scenario, each secondary user in the cognitive radio network
still achieves the same false alarm probability $P_F$, while the
users who decline to join in cooperative sensing will have more
time for their own data transmission. If no one contributes to
sensing, each hoping that someone else will sense the spectrum,
all the secondary users are punished by a very low throughput.

Therefore, we can model the spectrum sensing as a
non-cooperative game. The players of the game are the secondary
users, denoted by $\mathcal{S} = \{s_1, \ldots, s_K\}$. Each
player $s_k$ has the same action/strategy space, denoted by
$\mathcal{A} = \{C, D\}$. The payoff function is defined as the
throughput of the secondary user. Assume that the secondary users
contributing to cooperative sensing form a set, denoted by
$\mathcal{S}_c = \{s_1, \ldots, s_J\}$. Denote the false alarm
probability of the cooperative sensing among set $\mathcal{S}_c$
with fusion rule “RULE” and a target detection probability
$\bar{P}_D$ by
$P_F^{\mathcal{S}_c} = P_F(\bar{P}_D, N, \{\gamma_i\}_{i \in \mathcal{S}_c}, \text{RULE})$.
Then the payoff for a contributor $s_j \in \mathcal{S}_c$ can be
defined as

$$U_{C,s_j} = P_{\mathcal{H}_0} \left(1 - \frac{\delta(N)}{|\mathcal{S}_c|\, T}\right) (1 - P_F^{\mathcal{S}_c})\, C_{s_j}, \quad \text{if } |\mathcal{S}_c| \in [1, K], \tag{10}$$

where $|\mathcal{S}_c|$, i.e., the cardinality of set
$\mathcal{S}_c$, represents the number of contributors, and
$C_{s_j}$ is the data rate for user $s_j$ under hypothesis
$\mathcal{H}_0$. Here we assume that the cost of sensing,
$\delta(N)$ (the time needed to collect $N$ samples), is equally
shared by all contributors, and $N$ is a large number agreed upon
by the group of contributors to guarantee a low $P_F$.
The payoff for a denier $s_i \notin \mathcal{S}_c$, who selects
strategy $D$, is then given by

$$U_{D,s_i} = P_{\mathcal{H}_0} (1 - P_F^{\mathcal{S}_c})\, C_{s_i}, \quad \text{if } |\mathcal{S}_c| \in [1, K-1], \tag{11}$$

since $s_i$ will not spend time in sensing. If no secondary user
contributes to spectrum sensing and every user waits for the
others to sense, i.e., $|\mathcal{S}_c| = 0$, we know from (5)
that $\lim_{N \to 0} P_F = 1$, especially for the low received
SNR regime and a high $\bar{P}_D$ requirement. In this case, the
payoff for a denier becomes

$$U_{D,s_i} = 0, \quad \text{if } |\mathcal{S}_c| = 0. \tag{12}$$


The decision fusion rule can be selected from the logical-OR
rule, logical-AND rule, and majority rule [8]. In this paper,
we mainly focus on the logical-OR rule to derive
$P_F^{\mathcal{S}_c}$, but the other fusion rules could be
analyzed similarly. Denote the detection and false alarm
probabilities for a contributor $s_j \in \mathcal{S}_c$ by
$P_{D,s_j}$ and $P_{F,s_j}$, respectively. Then, under the OR
rule we have

$$\bar{P}_D = 1 - \prod_{s_j \in \mathcal{S}_c} (1 - P_{D,s_j}), \tag{13}$$

and

$$P_F^{\mathcal{S}_c} = 1 - \prod_{s_j \in \mathcal{S}_c} (1 - P_{F,s_j}). \tag{14}$$

Hence, given a $\bar{P}_D$ for set $\mathcal{S}_c$, each
individual user’s target detection probability can be expressed
as

$$\bar{P}_{D,s_j} = 1 - (1 - \bar{P}_D)^{1/|\mathcal{S}_c|}. \tag{15}$$

Then, from (5) we can write $P_{F,s_j}$ as

$$P_{F,s_j} = Q\left(\sqrt{2\gamma_{s_j} + 1}\, Q^{-1}(\bar{P}_{D,s_j}) + \sqrt{N/|\mathcal{S}_c|}\,\gamma_{s_j}\right), \tag{16}$$

and can further obtain $P_F^{\mathcal{S}_c}$ by substituting (16)
into (14).
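The chain (15), (16), (14) can be sketched in a few lines of Python. This is a minimal illustration under the OR rule only, with hypothetical function names; SNRs are passed as linear ratios, one per contributor, and the total sample count $N$ is split evenly as in (16).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def per_user_detection_target(pd_target, n_contributors):
    """Eq. (15): individual target so that the OR-fused P_D meets pd_target."""
    return 1.0 - (1.0 - pd_target) ** (1.0 / n_contributors)


def cooperative_false_alarm(pd_target, n_samples, snrs):
    """Eqs. (14) and (16) under the OR rule; `snrs` holds one linear SNR per contributor."""
    n_contributors = len(snrs)
    pd_user = per_user_detection_target(pd_target, n_contributors)
    no_alarm = 1.0  # running product of (1 - P_F,sj) over the contributors
    for snr in snrs:
        arg = sqrt(2.0 * snr + 1.0) * STD_NORMAL.inv_cdf(1.0 - pd_user)
        arg += sqrt(n_samples / n_contributors) * snr  # N/|S_c| samples per user
        no_alarm *= STD_NORMAL.cdf(arg)  # 1 - Q(arg) equals Phi(arg)
    return 1.0 - no_alarm


snrs_db = (-13.0, -12.0, -11.0)  # illustrative values only
snrs = [10.0 ** (v / 10.0) for v in snrs_db]
print(cooperative_false_alarm(0.95, 3000, snrs))
```

With a single contributor the function reduces to (5), which gives a quick sanity check on the implementation.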

B.  Analysis  of  the  Game 

Since the data transmission for each secondary user is
continuous, the spectrum sensing game is played repeatedly
and will evolve over time. Therefore, we can use evolutionary
game theory to analyze the evolutionary dynamics of the
players and further derive the equilibrium [14].

1) Evolution Dynamics of the Sensing Game: The development
of evolutionary game theory is a major contribution of
biology to competitive decision making. The key concept of
evolutionary game theory is replicator dynamics, which describes
the evolution of strategies in time. Specifically, consider a
large population of homogeneous individuals who are programmed
to the same set of pure strategies $\mathcal{A}$ in a symmetric
game with payoff function $U$. At time $t$, let
$p_{a_i}(t) \geq 0$ be the number of individuals who are
currently programmed to pure strategy $a_i \in \mathcal{A}$, and
let $p(t) = \sum_{a_i \in \mathcal{A}} p_{a_i}(t) > 0$ be the
total population. Then the associated population state is defined
as the vector $x(t) = \{x_{a_1}(t), \ldots, x_{a_{|\mathcal{A}|}}(t)\}$,
where $x_{a_i}(t)$ is defined as the population share
$x_{a_i}(t) = p_{a_i}(t)/p(t)$. By replicator dynamics, the
evolution dynamics of $x_{a_i}(t)$ is given by the following
differential equation

$$\dot{x}_{a_i} = \epsilon \left[\bar{U}(a_i, x_{-a_i}) - \bar{U}(x)\right] x_{a_i}, \tag{17}$$

where $\bar{U}(a_i, x_{-a_i})$ is the instantaneous average
payoff of the individuals using $a_i$, $\bar{U}(x)$ is the
instantaneous average payoff of the whole population, and
$\epsilon$ is some positive number representing the time scale.
The intuition behind (17) is as follows: if strategy $a_i$
results in a higher payoff than the average level, the population
share using $a_i$ will grow, and the growth rate
$\dot{x}_{a_i}/x_{a_i}$ is proportional to the difference between
strategy $a_i$’s current payoff and the current average payoff in
the entire population. By analogy, we can view $x_{a_i}(t)$ as
the probability that one player in a symmetric game adopts pure
strategy $a_i$, and $x(t)$ can be equivalently viewed as a mixed
strategy for that player.
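To make (17) concrete, here is a minimal Euler-integration sketch of replicator dynamics for a symmetric matrix game. The 2-by-2 payoff matrix is purely hypothetical and is not the sensing game; it was chosen only so that the two strategies earn equal payoffs when each holds half the population. The function name and step size are likewise assumptions.

```python
def replicator_step(shares, payoff_matrix, eps, dt):
    """One explicit Euler step of (17) for a symmetric matrix game."""
    n = len(shares)
    # Average payoff of each pure strategy against the current population mix.
    strategy_payoffs = [
        sum(payoff_matrix[i][k] * shares[k] for k in range(n)) for i in range(n)
    ]
    # Average payoff of the whole population.
    mean_payoff = sum(x * u for x, u in zip(shares, strategy_payoffs))
    updated = [
        x + dt * eps * (u - mean_payoff) * x
        for x, u in zip(shares, strategy_payoffs)
    ]
    total = sum(updated)  # renormalize to guard against round-off drift
    return [x / total for x in updated]


# Hypothetical payoffs: row = own strategy, column = opponent's strategy.
payoffs = [[0.0, 3.0],
           [1.0, 2.0]]
shares = [0.1, 0.9]
for _ in range(5000):
    shares = replicator_step(shares, payoffs, eps=1.0, dt=0.01)
print(shares)
```

A strategy whose payoff exceeds the population average gains share at each step, and the shares stop moving once all surviving strategies earn the same payoff.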

Then, we can generalize (17) to the spectrum sensing game
with heterogeneous players, as $C_{s_i}$ may vary among different
users. Denote the probability that user $s_j$ adopts strategy
$h \in \mathcal{A}$ at time $t$ by $x_{h,s_j}(t)$; then the time
evolution of $x_{h,s_j}(t)$ is governed by the following
differential equation:

$$\dot{x}_{h,s_j} = \left[\bar{U}_{s_j}(h, x_{-s_j}) - \bar{U}_{s_j}(x)\right] x_{h,s_j}, \tag{18}$$

where $\bar{U}_{s_j}(h, x_{-s_j})$ is the average payoff for
player $s_j$ using pure strategy $h$, and $\bar{U}_{s_j}(x)$ is
$s_j$’s average payoff using mixed strategy $x_{s_j}$.

2) Equilibrium Analysis: If each user $s_j$ maximizes his/her
total payoff by choosing the optimal probability of being
a contributor (or a denier), $x_{h,s_j}$, where $h = C$ (or $D$),
the outcome of the game can be characterized by the Nash
equilibrium (NE) [14]. At an NE, no player can gain a higher
payoff by unilaterally deviating from the equilibrium strategy,
given that the other players adopt their equilibrium strategies.

The steady-state solution to (18) given any initial condition
is defined as the evolutionarily stable strategy (ESS). It is
shown in [14] that the ESS is a refinement of NE. It is generally
difficult to solve (18) and obtain the equilibrium of the game
if the number of users is large. Therefore, in this section, we
first analyze a special symmetric sensing game to gain some
insight, and then develop a distributed learning algorithm for
the players to achieve the NE in the long run.

As shown in Fig. 1, all the secondary users are assumed
to be clustered together and located far away from the primary
base station, so the received $\gamma_{s_j}$’s are very low and
similar to each other. In order to guarantee a low $P_F$ given a
target $\bar{P}_D$, the number of sampled signals $N$ should be
large. Under these assumptions, we can approximately view
$P_F^{\mathcal{S}_c}$ as the same for different
$\mathcal{S}_c$’s, denoted by $P_F$. Further assume that all
users have the same data rate, i.e., $C_{s_i} = C$ for all
$s_i \in \mathcal{S}$. Then, the payoff functions defined in
(10)-(12) become

$$U_C(J) = U_0 \left(1 - \frac{\tau}{J}\right), \quad \text{if } J \in [1, K], \tag{19}$$

and

$$U_D(J) = \begin{cases} U_0, & \text{if } J \in [1, K-1]; \\ 0, & \text{if } J = 0, \end{cases} \tag{20}$$

where $U_0 = P_{\mathcal{H}_0}(1 - P_F)C$, $J = |\mathcal{S}_c|$,
and $\tau = \delta(N)/T$ is the ratio of the sensing time to the
frame duration.

According to the symmetric setting, (17) can be applied
to this special case, as all players have the same evolution
dynamics and equilibrium strategy. Denote by $x$ the probability
that one secondary user contributes to spectrum sensing; then
the average payoff for pure strategy $C$ can be obtained as

$$\bar{U}_C = \sum_{j=0}^{K-1} \binom{K-1}{j} x^j (1 - x)^{K-1-j}\, U_C(j + 1), \tag{21}$$

where $\binom{K-1}{j} x^j (1 - x)^{K-1-j}$ is the probability
that $j$ of the other $K - 1$ users contribute, so that $j + 1$
users in total contribute to cooperative sensing. Similarly, the
average payoff for pure strategy $D$ is given by

$$\bar{U}_D = \sum_{j=0}^{K-1} \binom{K-1}{j} x^j (1 - x)^{K-1-j}\, U_D(j). \tag{22}$$

Since the average payoff is
$\bar{U} = x \bar{U}_C + (1 - x) \bar{U}_D$, (17) becomes

$$\dot{x} = \epsilon x (1 - x)(\bar{U}_C - \bar{U}_D). \tag{23}$$

At the equilibrium $x^*$, no player will deviate from the optimal
strategy, indicating $\dot{x}^* = 0$, or
$\bar{U}_C = \bar{U}_D$. Then, by equating (21) and (22), we
obtain the following $K$th-order equation

$$\tau (1 - x^*)^K + K x^* (1 - x^*)^{K-1} - \tau = 0, \tag{24}$$

from which the equilibrium can be solved.

Fig. 3: Probability of being a contributor vs. $\tau$ (plot title: optimal probability of cooperation vs. $\tau$)

3) Learning Algorithm for Nash Equilibrium: From (18),
we can derive the strategy adjustment for the secondary user
as follows. Denote the pure strategy taken by user $s_j$ at time
$t$ by $A_{s_j}(t)$. Define an indicator function
$\mathbf{1}^h_{s_j}(t)$ as

$$\mathbf{1}^h_{s_j}(t) = \begin{cases} 1, & \text{if } A_{s_j}(t) = h; \\ 0, & \text{if } A_{s_j}(t) \neq h. \end{cases} \tag{25}$$

At some interval $mT$, we can approximate
$\bar{U}_{s_j}(h, x_{-s_j})$ by

$$\bar{U}_{s_j}(h, x_{-s_j}) = \frac{\sum_{t \leq mT} U_{s_j}(A_{s_j}(t), A_{-s_j}(t))\, \mathbf{1}^h_{s_j}(t)}{\sum_{t \leq mT} \mathbf{1}^h_{s_j}(t)}, \tag{26}$$

where $U_{s_j}(A_{s_j}(t), A_{-s_j}(t))$ is the payoff value for
$s_j$ as determined by (10)-(12). Similarly, $\bar{U}_{s_j}(x)$
is approximated by

$$\bar{U}_{s_j}(x) = \frac{1}{m} \sum_{t \leq mT} U_{s_j}(A_{s_j}(t), A_{-s_j}(t)). \tag{27}$$

Then, the derivative $\dot{x}_{h,s_j}(mT)$ can be approximated by
substituting (26) and (27) into (18). Therefore, the probability
of user $s_j$ taking action $h$ can be adjusted by

$$x_{h,s_j}((m + 1)T) = x_{h,s_j}(mT) + \eta_{s_j}\, \dot{x}_{h,s_j}(mT), \tag{28}$$

with $\eta_{s_j}$ being the step size of adjustment chosen by
$s_j$. We will demonstrate the convergence of the learning
algorithm in the next section.

IV.  Simulation  Results  and  Analysis 

The parameters used in the simulation are as follows. We
assume that the primary signal is a baseband QPSK modulated
signal, the sampling frequency is $f_s = 4$ MHz, and the frame
duration is $T = 5$ ms. The probability that the primary user is
inactive is set as $P_{\mathcal{H}_0} = 0.9$, and the required
target detection probability $\bar{P}_D$ is 0.95. The noise is
assumed to be a zero-mean CSCG process. The received
$\gamma_{s_j}$’s are in the low SNR regime, with an average value
of $-12$ dB.

Fig. 4: Average throughput per user vs. $\tau$

We first illustrate the optimal equilibrium strategy for the
secondary users assuming a homogeneous setting as in Section
III-B2, where the data rate is $C = 1$ Mbps. In Fig. 3, we
show the optimal probability of being a contributor $x^*$ for
networks with different numbers of secondary users. The x-axis
represents $\tau = \delta(N)/T$, the ratio of the sensing time
duration to the frame duration. From Fig. 3, we can see that
$x^*$ decreases as $\tau$ increases. For the same $\tau$, $x^*$
decreases as the number of secondary users increases. This
indicates that the incentive to contribute to the cooperative
sensing drops as the cost of sensing increases and as more users
exist in the network. This is because the players tend to wait
for someone else to sense the spectrum and then enjoy a free
ride when they are faced with a high sensing cost and more
counterpart players.
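These trends can be checked directly on (24). The following sketch finds the nonzero root of (24) in the interval (0, 1) by bisection; the function names are hypothetical, and the search assumes $K \geq 2$ and $0 < \tau < 1$, for which the left-hand side of (24) is positive just above $x = 0$ and equals $-\tau$ at $x = 1$.

```python
def equilibrium_residual(x, num_users, tau):
    """Left-hand side of (24)."""
    return (tau * (1.0 - x) ** num_users
            + num_users * x * (1.0 - x) ** (num_users - 1)
            - tau)


def solve_equilibrium(num_users, tau, iterations=100):
    """Bisection for the root of (24) in (0, 1), assuming K >= 2 and 0 < tau < 1."""
    lo, hi = 1e-9, 1.0  # residual is positive just above 0 and negative at 1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if equilibrium_residual(mid, num_users, tau) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for num_users in (2, 3, 5):
    row = [round(solve_equilibrium(num_users, tau), 3) for tau in (0.2, 0.5, 0.8)]
    print(num_users, row)
```

For $K = 3$ and $\tau = 0.5$, (24) factors as $(u - 1)(5u^2 - u - 1) = 0$ with $u = 1 - x^*$, giving $x^* = 1 - (1 + \sqrt{21})/10 \approx 0.44$, which the bisection reproduces.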

In Fig. 4, we show the average throughput per user when
all users adopt the equilibrium strategy. We see that there is a
tradeoff between the cost of sensing and the throughput. The
optimal value of $\tau$ is around 0.15, and it slightly increases
as the number of users increases. This is because the false
alarm probability $P_F$ in (14) tends to increase as the number
of users increases. In order to have a low $P_F$, the users need
to collect more samples for better detection. Although the cost
of sensing increases, as more users share the sensing cost,
the optimal average throughput per user still increases. We
also plot the optimal throughput for single-user sensing
(dotted line “single”) for comparison. It is interesting that the
average throughput values for games with more than 2 users
are all higher than that of single-user sensing, while the
throughput for the 2-user game is not. The reason is that when
there are more than 2 users in the game, the chance that no
user contributes to sensing is smaller, whereas it is more
likely that neither user senses the spectrum in the 2-user game.
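Under the homogeneous assumptions of Section III-B2, the role of the "no one else contributes" event can be written out explicitly. The short derivation below is a worked step added for illustration and follows from (20) and (22) alone. Note that $U_0$ still depends on $P_F$, and hence on $N$ and on the number of contributors, so this expression does not by itself reproduce the curves in Fig. 4.

```latex
% From (20), U_D(j) = U_0 for j >= 1 and U_D(0) = 0, so (22) gives
\bar{U}_D = U_0 \sum_{j=1}^{K-1} \binom{K-1}{j} x^j (1-x)^{K-1-j}
          = U_0 \left[ 1 - (1-x)^{K-1} \right].
% At the equilibrium, \bar{U}_C = \bar{U}_D, so the average payoff per user is
\bar{U}(x^*) = x^* \bar{U}_C + (1 - x^*) \bar{U}_D
             = U_0 \left[ 1 - (1 - x^*)^{K-1} \right].
```

Here $(1 - x^*)^{K-1}$ is the probability that none of the other $K - 1$ users contributes, which is the event that drives a user's payoff toward zero in (20).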

We finally show the learning curve for the probability of
being a contributor in a 3-player game in Fig. 5, with
$\tau = 0.5$, the learning step size $\eta_{s_j} = 0.002$,
$\gamma_1 = -13$ dB, $\gamma_2 = -12$ dB, and $\gamma_3 = -11$ dB.
We see that in the long run, all three users gradually reach the
equilibrium strategy, which is about 0.44.

Fig. 5: Learning curve for a 3-user sensing game (plot title: probability of cooperation vs. number of iterations)

V.  Conclusion 

In this paper, we propose an evolutionary game-theoretical
framework for distributed cooperative sensing over cognitive
radio networks. By employing the theory of replicator
dynamics, we study the behavior dynamics of secondary
users, and further propose a distributed learning algorithm
that gradually converges to the Nash equilibrium. The
simulation results show that the average throughput per user in
a $K$-user sensing game ($K > 2$) is higher than that in
single-user sensing.